Approximate earth mover's distance in linear time

Authors

  • Sameer Sheorey
  • David W. Jacobs
Abstract

The earth mover's distance (EMD) [21] is an important, perceptually meaningful metric for comparing histograms, but it suffers from high computational complexity, O(n^3 log n). We present a novel linear-time algorithm for approximating the EMD for low-dimensional histograms using the sum of absolute values of the weighted wavelet coefficients of the difference histogram. EMD computation is a special case of the Kantorovich-Rubinstein transshipment problem, and we exploit the Hölder continuity constraint in its dual form to convert it into a simple optimization problem with an explicit solution in the wavelet domain. We prove that the resulting wavelet EMD metric is equivalent to EMD, i.e., the ratio of the two is bounded. We also provide estimates for the bounds. The weighted wavelet transform can be computed in time linear in the number of histogram bins, while the comparison is about as fast as for the ordinary Euclidean distance or the χ² statistic. We experimentally show that wavelet EMD is a good approximation to EMD and has similar performance, but requires much less computation.
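As a rough illustration of the idea in the abstract, the following sketch computes a weighted sum of absolute Haar wavelet coefficients of the difference histogram for the 1D case. It is a minimal sketch under stated assumptions, not the authors' implementation: the choice of the Haar wavelet, the weight 2^(-j(s + 1/2)) with s = 1, the convention that j = 0 is the coarsest scale, and the function names `haar_details`, `wavelet_emd_1d` and `emd_1d` are all assumptions made here for illustration. The exact 1D EMD (sum of absolute cumulative differences, valid for equal-mass histograms with unit bin spacing) is included only for comparison.

```python
import numpy as np


def haar_details(x):
    """Orthonormal Haar detail coefficients, finest scale first.

    Runs in time linear in len(x); len(x) must be a power of two.
    """
    a = np.asarray(x, dtype=float)
    n = a.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    details = []
    while a.size > 1:
        even, odd = a[0::2], a[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        a = (even + odd) / np.sqrt(2.0)
    return details


def wavelet_emd_1d(p, q, s=1.0):
    """Weighted L1 norm of the Haar details of p - q (illustrative).

    Assumed weight for scale j is 2**(-j * (s + 1/2)), with j = 0 the
    coarsest scale. The approximation coefficient is ignored; it is
    zero when p and q have equal total mass.
    """
    diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    details = haar_details(diff)
    levels = len(details)
    total = 0.0
    for i, d in enumerate(details):
        j = levels - 1 - i  # details[0] is the finest scale
        total += 2.0 ** (-j * (s + 0.5)) * np.abs(d).sum()
    return total


def emd_1d(p, q):
    """Exact 1D EMD for equal-mass histograms with unit bin spacing."""
    diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    return np.abs(np.cumsum(diff)).sum()


if __name__ == "__main__":
    p = [1.0, 0.0, 0.0, 0.0]
    q = [0.0, 0.0, 0.0, 1.0]
    print("wavelet EMD (sketch):", wavelet_emd_1d(p, q))
    print("exact 1D EMD:        ", emd_1d(p, q))
```

The two values are not expected to coincide; the abstract only claims that their ratio is bounded. The constant factor in this sketch depends on the scale-indexing convention assumed above.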


Similar resources

CAR-TR-1025, CS-TR-4908, UMIACS-TR-2008-06, 21 February 2008: Approximate earth mover's distance in linear time

The earth mover's distance (EMD) [19] is an important, perceptually meaningful metric for comparing histograms, but it suffers from high computational complexity, O(n^3 log n). We present a novel linear-time algorithm for approximating the EMD for low-dimensional histograms using the sum of absolute values of the weighted wavelet coefficients of the difference histogram. EMD computation is a spe...


On constant factor approximation for earth mover distance over doubling metrics

Given a metric space (X, d_X), the earth mover distance between two distributions over X is defined as the minimum cost of a bipartite matching between the two distributions. The doubling dimension of a metric (X, d_X) is the smallest value α such that every ball in X can be covered by 2^α balls of half the radius. A metric (or a sequence of metrics) is called doubling precisely if its doubling dime...


Approximate earth mover's distance in linear time

Theorem 2 ([3]). A function f ∈ L_loc(R^n) belongs to C^s(R^n) if and only if, in a wavelet decomposition of regularity r ≥ 1 > s, there exist constants C_0 and C_1 such that

  Approximation coefficients: |f_k| ≤ C_0, k ∈ Z^n, and
  Detail coefficients: |f_λ| ≤ C_1 2^(−j(n/2+s)), λ ∈ Λ_j, j ≥ 0.   (1)

Lemma 1. If 0 < s < 1 and (1) holds, then f ∈ C^s(R^n) with C_H(f) < C such that

  a_12(ψ; s) C_1 ≤ C ≤ a_21(ψ; s) C_0 + a_22(ψ; s) C_1.   (2)

For discrete d...


Sketching Earth-Mover Distance on Graph Metrics

We develop linear sketches for estimating the Earth-Mover distance between two point sets, i.e., the cost of the minimum weight matching between the points according to some metric. While Euclidean distance and Edit distance are natural measures for vectors and strings respectively, Earth-Mover distance is a well-studied measure that is natural in the context of visual or metric data. Our work ...


Improved Approximation Algorithms for Earth-Mover Distance in Data Streams

For two multisets S and T of points in [∆], such that |S| = |T| = n, the earth-mover distance (EMD) between S and T is the minimum cost of a perfect bipartite matching with edges between points in S and T, i.e., EMD(S, T) = min_{π: S→T} ∑_{a∈S} ||a − π(a)||_1, where π ranges over all one-to-one mappings. The sketching complexity of approximating earth-mover distance in the two-dimensional grid is men...



Publication date: 2008